IN THE UNITED STATES PATENT AND TRADEMARK OFFICE
BEFORE THE BOARD OF PATENT APPEALS AND INTERFERENCES 

In re Application of: Bandman et al. 

Title: HUMAN MITOCHONDRIAL MALATE DEHYDROGENASE 

Serial No.: 09/915,694 Filing Date: July 25, 2001 

Examiner: Fronda, C. Group Art Unit: 1652 

Mail Stop Appeal Brief-Patents 
Commissioner for Patents 
P.O. Box 1450 

Alexandria, Virginia 22313-1450 

BRIEF ON APPEAL 

Sir: 

Further to the Notice of Appeal filed on September 30, 2003, and received by the USPTO
on October 6, 2003, herewith are three copies of Appellants' Brief on Appeal. Appellants hereby
request a one-month extension of time in order to file this Brief. Authorized fees include the
statutory fee of $110 for a one-month extension of time, as well as the $330.00 fee for the filing
of this Brief.

This is an appeal from the decision of the Examiner finally rejecting claims 3, 6, 7, 9 and 
12 of the above-identified application. 

(1) REAL PARTY IN INTEREST

The above-identified application is assigned of record to Incyte Pharmaceuticals, Inc. 
(now Incyte Corporation, formerly known as Incyte Genomics, Inc.) (Reel 9774, Frame 0975), 
which is the real party in interest herein. 

(2) RELATED APPEALS AND INTERFERENCES
Appellants, their legal representative and the assignee are not aware of any related 
appeals or interferences which will directly affect or be directly affected by or have a bearing on 
the Board's decision in the instant appeal. 

(3) STATUS OF THE CLAIMS

Claims rejected: Claims 3, 6, 7, 9 and 12
Claims allowed: (none)
Claims objected to: Claims 4, 5 and 10
Claims canceled: Claims 1, 2, 8, 11, 13, 17-27, 30-45
Claims withdrawn: Claims 14-16, 28 and 29
Claims on Appeal: Claims 3, 6, 7, 9 and 12 (A copy of the claims on appeal, as
amended, can be found in the attached Appendix).



(4) STATUS OF AMENDMENTS AFTER FINAL 
There were no amendments made after final. 



(5) SUMMARY OF THE INVENTION 

Embodiments of the present invention are directed, inter alia, to polynucleotides
encoding mitochondrial malate dehydrogenase (MT-MDH). In particular embodiments, the
mitochondrial malate dehydrogenases are selected from among amino acid sequences comprising
SEQ ID NO:1. These polypeptides have strong chemical and structural homology with known
mitochondrial malate dehydrogenases. For example:

MT-MDH is 338 amino acids in length and has two potential N-glycosylation
sites at residues N-117 and N-145, seven potential casein kinase II
phosphorylation sites at T-54, S-69, T-109, T-170, S-261, S-309, and S-310, four
potential protein kinase C phosphorylation sites at residues T-213, T-227, S-326,
and T-336, a mitochondrial malate dehydrogenase active site signature between
residues V-169 and V-181, and a transit peptide sequence from residues M-1 to
N-24. As shown in Figures 2A and 2B, MT-MDH has chemical and structural
homology with murine mitochondrial malate dehydrogenase (GI
56643; SEQ ID NO:3) and porcine mitochondrial malate
dehydrogenase (GI 164541; SEQ ID NO:4). In particular, MT-MDH and murine
mitochondrial malate dehydrogenase share 94% identity, share both
potential N-glycosylation sites, six potential casein kinase II sites, three potential
protein kinase C sites, the mitochondrial malate dehydrogenase active site
signature, and the transit peptide sequence. As illustrated by Figures 3A and 3B,
respectively, MT-MDH and murine mitochondrial malate
dehydrogenase (SEQ ID NO:3) have similar isoelectric points (pI = 8.8). As
illustrated by Figures 4A and 4B, MT-MDH contains potential NAD(H) and
NADP(H) binding site motifs. Northern analysis shows the expression of this
sequence in various libraries, at least 49% of which are immortalized or cancerous
and at least 24% of which involve immune response. Of particular note is the
expression of MT-MDH in fetal tissues; in cardiovascular, gut, nervous, and
reproductive tissues; and in secretory and hematopoietic tissues. (Specification at
page 14, line 29 to page 15, line 17).

The polynucleotides of the present invention have a variety of utilities. For example, they 
can be used for the diagnosis, prevention, or treatment of vesicle trafficking, immunological and 
neoplastic disorders. (See the Specification, e.g., at page 14, lines 18-21.)

(6) ISSUES 

1. Whether claims 3, 6, 7, 9 and 12 meet the written description requirement of 35 
U.S.C. § 112, first paragraph. 

2. Whether claims 3, 6, 7, 9 and 12 meet the enablement requirement of 35 U.S.C. 
§ 112, first paragraph. 

(7) GROUPING OF THE CLAIMS 

As to Issue 1 

All of the claims on appeal are grouped together. 
As to Issue 2 

All of the claims on appeal are grouped together. 

(8) APPELLANTS' ARGUMENTS

Issue 1 - Written description rejection under 35 U.S.C. § 112, first paragraph

Claims 3, 6, 7, 9 and 12 were rejected under 35 U.S.C. § 112, first paragraph, as allegedly
"containing subject matter which was not described in the specification in such a way as to
reasonably convey to one skilled in the relevant art the inventor(s) at the time the application was
filed, had possession of the claimed invention." (06/20/03 Office Action, at page 4) In making
the rejection, the Examiner asserts that:

There is no disclosure of any particular structure to function/activity relationship in the
single disclosed species. The specification also fails to describe additional
representative species of these polynucleotides by any identifying structural
characteristics or properties for which no predictability of structure is apparent. Given
this lack of additional representative species as encompassed by the claims, Applicants
have failed to sufficiently disclose the invention in such full, clear, concise and exact
terms that a skilled artisan would recognize Applicants were in possession of the claimed
invention. (12/17/02 Office Action, at pages 4-5).

This rejection is improper, as the claims define subject matter which is described in the
Specification in such a way as to reasonably convey to one skilled in the art that the inventors
had possession of the claimed subject matter at the time the application was filed. The
requirements necessary to fulfill the written description requirement of 35 U.S.C. § 112, first
paragraph, are well established by case law:

... the applicant must also convey with reasonable clarity to those skilled 
in the art that, as of the filing date sought, he or she was in possession of 
the invention. The invention is, for purposes of the "written description" 
inquiry, whatever is now claimed. Vas-Cath, Inc. v. Mahurkar, 19 
U.S.P.Q.2d 1111, 1117 (Fed. Cir. 1991). 

The Board's attention is also drawn to the Patent and Trademark Office's own
"Guidelines for Examination of Patent Applications Under the 35 U.S.C. § 112, para. 1", published
January 5, 2001, which provide that:

An applicant may also show that an invention is complete by disclosure of
sufficiently detailed, relevant identifying characteristics which provide evidence
that applicant was in possession of the claimed invention, i.e., complete or
partial structure, other physical and/or chemical properties, functional
characteristics when coupled with a known or disclosed correlation between
function and structure, or some combination of such characteristics. What is
conventional or well known to one of ordinary skill in the art need not be
disclosed in detail. If a skilled artisan would have understood the inventor to be
in possession of the claimed invention at the time of filing, even if every nuance of
the claims is not explicitly described in the specification, then the adequate
description requirement is met.

Thus, the written description standard is fulfilled by both what is specifically disclosed 
and what is conventional or well known to one skilled in the art. 

A. The specification provides an adequate written description of the claimed
"variants" of SEQ ID NO:1 and SEQ ID NO:2.

The subject matter encompassed by claims 3, 6, 7, 9 and 12 is either disclosed by the 
specification or is conventional or well known to one skilled in the art. 

First, note that the "variant" language of independent claim 3 recites a polynucleotide
encoding "a polypeptide comprising a naturally occurring amino acid sequence at least 95%
identical to the amino acid sequence of SEQ ID NO:1" and the "variant" language of independent
claim 12 recites "a polynucleotide comprising a naturally occurring polynucleotide sequence at
least 95% identical to the polynucleotide sequence of SEQ ID NO:2."

The amino acid sequence of SEQ ID NO:1 and the polynucleotide sequence of SEQ ID
NO:2 are explicitly disclosed in the specification. See, for example, the Sequence Listing.
Variants of SEQ ID NO:1 and SEQ ID NO:2 are described in the Specification at, for example,
page 4, lines 13-14; page 7, lines 4-7 and 12-18; and page 15, lines 18-27.

One of ordinary skill in the art would recognize polynucleotide sequences which are
variants having a polynucleotide sequence at least 95% identical to SEQ ID NO:2, or which encode
polypeptide variants having an amino acid sequence at least 95% identical to SEQ ID NO:1.
Given any naturally occurring polynucleotide sequence, it would be routine for one of skill in the
art to recognize whether it was a variant of SEQ ID NO:2, or whether it encoded a variant of SEQ
ID NO:1. Accordingly, the specification provides an adequate written description of the recited
polynucleotide variants of SEQ ID NO:2 and polynucleotides encoding polypeptide variants of
SEQ ID NO:1.
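
By way of illustration only, the kind of comparison referred to above can be sketched in a few lines of Python. The sketch assumes the two sequences have already been aligned to equal length, treats any column containing a gap as a mismatch (one of several possible conventions), and uses made-up toy sequences rather than SEQ ID NO:1 or SEQ ID NO:2. The function names are hypothetical and do not appear in the Specification; in practice one would rely on an established alignment program.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Assumes both inputs are already aligned to equal length, with '-' as the
    gap character. Columns containing a gap are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def is_variant(aligned_ref: str, aligned_candidate: str,
               threshold: float = 95.0) -> bool:
    """True if the candidate meets the percent-identity threshold."""
    return percent_identity(aligned_ref, aligned_candidate) >= threshold


# Toy 20-residue sequences (hypothetical, for illustration only)
reference = "ACDEFGHIKLMNPQRSTVWY"
candidate = "ACDEFGHIKLMNPQRSTVWF"  # one substitution: 19 of 20 positions match

print(percent_identity(reference, candidate))  # 95.0
print(is_variant(reference, candidate))        # True
```

The threshold is a parameter so that the same check applies to either the amino acid sequence or the polynucleotide sequence.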

1. The present claims specifically define the claimed genus through the 
recitation of chemical structure 

Court cases in which "DNA claims" have been at issue commonly emphasize that the
recitation of structural features or chemical or physical properties is an important factor to
consider in a written description analysis of such claims. For example, in Fiers v. Revel, 25
U.S.P.Q.2d 1601, 1606 (Fed. Cir. 1993), the court stated that:

If a conception of a DNA requires a precise definition, such as by 
structure, formula, chemical name or physical properties, as we have held, 
then a description also requires that degree of specificity. 

In a number of instances in which claims to DNA have been found invalid, the courts
have noted that the claims attempted to define the claimed DNA in terms of functional
characteristics without any reference to structural features. As set forth by the court in University
of California v. Eli Lilly and Co., 43 U.S.P.Q.2d 1398, 1406 (Fed. Cir. 1997):

In claims to genetic material, however, a generic statement such as 
"vertebrate insulin cDNA" or "mammalian insulin cDNA," without more, 
is not an adequate written description of the genus because it does not 
distinguish the claimed genus from others, except by function. 

Thus, the mere recitation of functional characteristics of a DNA, without the definition of 
structural features, has been a common basis by which courts have found invalid claims to DNA. 
For example, in Lilly, 43 U.S.P.Q.2d at 1407, the court found invalid for violation of the written 
description requirement the following claim of U.S. Patent No. 4,652,525: 

1. A recombinant plasmid replicable in procaryotic host containing within 
its nucleotide sequence a subsequence having the structure of the reverse 
transcript of an mRNA of a vertebrate, which mRNA encodes insulin. 

In Fiers, 25 U.S.P.Q.2d at 1603, the parties were in an interference involving the 
following count: 

A DNA which consists essentially of a DNA which codes for a human 
fibroblast interferon-beta polypeptide. 

Party Revel in the Fiers case argued that its foreign priority application contained an 
adequate written description of the DNA of the count because that application mentioned a 
potential method for isolating the DNA. The Revel priority application, however, did not have a 
description of any particular DNA structure corresponding to the DNA of the count. The court 
therefore found that the Revel priority application lacked an adequate written description of the 
subject matter of the count.

Thus, in Lilly and Fiers, nucleic acids were defined on the basis of functional
characteristics and were found not to comply with the written description requirement of 35
U.S.C. § 112; i.e., "an mRNA of a vertebrate, which mRNA encodes insulin" in Lilly, and "DNA
which codes for a human fibroblast interferon-beta polypeptide" in Fiers. In contrast to the
situation in Lilly and Fiers, the claims at issue in the present application define polynucleotides
and polypeptides in terms of chemical structure, rather than functional characteristics. For
example, the "variant language" of independent claims 3 and 12 recites chemical structure to
define the claimed genus:

3. An isolated polynucleotide encoding a polypeptide selected from the group
consisting of:

b) a polypeptide comprising a naturally occurring amino acid sequence at least
95% identical to the amino acid sequence of SEQ ID NO:1.

12. An isolated polynucleotide selected from the group consisting of:

b) a polynucleotide comprising a naturally occurring polynucleotide sequence at
least 95% identical to the polynucleotide sequence of SEQ ID NO:2, ....

From the above it should be apparent that the claims of the subject application are
fundamentally different from those found invalid in Lilly and Fiers. The subject matter of the
present claims is defined in terms of the chemical structure of SEQ ID NO:1 and SEQ ID NO:2.
In the present case, there is no reliance merely on a description of functional characteristics of the
polynucleotides and polypeptides recited by the claims. In fact, there is no recitation of
functional characteristics. Moreover, if such functional recitations were included, they would add to
the structural characterization of the recited polynucleotides. The polynucleotides defined in the
claims of the present application recite structural features, and cases such as Lilly and Fiers stress
that the recitation of structure is an important factor to consider in a written description analysis
of claims of this type. By failing to base its written description inquiry "on whatever is now
claimed," the Office Action fails to provide an appropriate analysis of the present claims and how
they differ from those found not to satisfy the written description requirement in Lilly and Fiers.

2. The present claims do not define a genus which is "highly variant"

Furthermore, the claims at issue do not describe a genus which could be characterized as 
"highly variant." Available evidence illustrates that the claimed genus is of narrow scope. 

In support of this assertion, the Board's attention is directed to the enclosed reference by 
Brenner et al. ("Assessing sequence comparison methods with reliable structurally identified 
distant evolutionary relationships," Proc. Natl. Acad. Sci. USA (1998) 95:6073-6078). Through 
exhaustive analysis of a data set of proteins with known structural and functional relationships 
and with <90% overall sequence identity, Brenner et al. have determined that 30% identity is a 
reliable threshold for establishing evolutionary homology between two sequences aligned over at 
least 150 residues. (Brenner et al., pages 6073 and 6076.) Furthermore, local identity is 
particularly important in this case for assessing the significance of the alignments, as Brenner et 
al. further report that ≥40% identity over at least 70 residues is reliable in signifying homology
between proteins. (Brenner et al., page 6076.) 

The present application is directed, inter alia, to polynucleotides encoding mitochondrial
malate dehydrogenases, including polynucleotides encoding mitochondrial malate
dehydrogenases related to the amino acid sequence of SEQ ID NO:1. In accordance with
Brenner et al., naturally occurring molecules may exist which could be characterized as
mitochondrial malate dehydrogenases and which have as little as 30% identity over at least 150
residues to SEQ ID NO:1. The "variant language" of the present claims recites, for example,
polynucleotides encoding a polypeptide comprising "a naturally occurring amino acid sequence
at least 95% identical to the amino acid sequence of SEQ ID NO:1" (note that SEQ ID NO:1 has
338 amino acid residues). This variation is far less than that of polynucleotides encoding all
potential mitochondrial malate dehydrogenases related to SEQ ID NO:1, i.e., those mitochondrial
malate dehydrogenases having as little as 30% identity over at least 150 residues to SEQ ID
NO:1.
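
The difference in scope can be made concrete with a short calculation, offered here only as an illustrative sketch. It assumes, for simplicity, that identity is counted over the full 338-residue length of SEQ ID NO:1 without gaps, and that the Brenner et al. threshold is applied to the minimum 150-residue window. The helper name is hypothetical.

```python
def min_identical(percent: int, length: int) -> int:
    """Smallest number of identical positions giving at least `percent` identity."""
    return -(-percent * length // 100)  # integer ceiling of percent*length/100


SEQ_LEN = 338  # length of SEQ ID NO:1, as stated in the text above

claimed = min_identical(95, SEQ_LEN)  # identical residues needed at 95% identity
brenner = min_identical(30, 150)      # identical residues needed at 30% over 150

print(claimed, SEQ_LEN - claimed)  # 322 16
print(brenner, 150 - brenner)      # 45 105
```

Under these assumptions, a 95% variant may differ from SEQ ID NO:1 at no more than 16 of 338 positions, whereas a sequence meeting only the 30% threshold may differ at up to 105 positions within a 150-residue alignment.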

3. The state of the art at the time of the present invention is further advanced 
than at the time of the Lilly and Fiers applications 

In the Lilly case, claims of U.S. Patent No. 4,652,525 were found invalid for failing to 
comply with the written description requirement of 35 U.S.C. §112. The '525 patent claimed the 
benefit of priority of two applications, Application Serial No. 801,343 filed May 27, 1977, and
Application Serial No. 805,023 filed June 9, 1977. In the Fiers case, party Revel claimed the
benefit of priority of an Israeli application filed on November 21, 1979. Thus, the written
description inquiry in those cases was based on the state of the art at essentially the "dark ages"
of recombinant DNA technology.

The present application has a priority date of September 3, 1997. Much has happened in
the development of recombinant DNA technology in the 18 or more years between the filing
of the applications involved in Lilly and Fiers and the filing of the present application. For example, the
technique of polymerase chain reaction (PCR) was invented. Highly efficient cloning and DNA
sequencing technology has been developed. Large databases of protein and nucleotide sequences
have been compiled. Much of the raw material of the human and other genomes has been
sequenced. With these remarkable advances one of skill in the art would recognize that, given
the sequence information of SEQ ID NO:1 and SEQ ID NO:2, and the additional extensive detail
provided by the subject application, the present inventors were in possession of the claimed
polynucleotide variants at the time of filing of this application.

4. Summary 

The 12/17/02 and 6/30/03 Office Actions fail to base the written description inquiry "on 
whatever is now claimed." Consequently, the Office Actions do not provide an appropriate 
analysis of the present claims and how they differ from those found not to satisfy the written 
description requirement in cases such as Lilly and Fiers. In particular, the claims of the subject 
application are fundamentally different from those found invalid in Lilly and Fiers. The subject 
matter of the present claims is defined in terms of the chemical structure of SEQ ID NO:1 and
SEQ ID NO:2. The courts have stressed that structural features are important factors to consider 
in a written description analysis of claims to nucleic acids and proteins. In addition, the genus of 
polynucleotides defined by the present claims is adequately described, as evidenced by Brenner 
et al. Furthermore, there have been remarkable advances in the state of the art since the Lilly and 
Fiers cases, and these advances were given no consideration whatsoever in the position set forth 
by the Office Action. 

For at least the reasons set forth above, the specification provides an adequate written 
description of the claimed subject matter, and this rejection should be reversed. 

Issue 2 - Enablement rejection under 35 U.S.C. § 112, first paragraph

Claims 3, 6, 7, 9 and 12 stand rejected under 35 U.S.C. § 112, first paragraph, allegedly
for lacking an enabling disclosure with respect to variants of SEQ ID NO:1. The Examiner has
specifically stated that these claims contain "subject matter which was not described in the
specification in such a way as to enable one skilled in the art to which it pertains, or with which it
is most nearly connected, to make and/or use the invention." (06/30/03 Office Action, at pages
2-3)

In making the rejection, the Examiner asserts that: 

"The specification does not provide guidance with respect to the specific 
structural/catalytic amino acids and the structural motifs essential for enzyme 
structure and activity function which cannot be altered. Thus searching for the 
specific nucleotides to change (deletion, insertion, substitution, or combinations 
thereof) in a polynucleotide is well outside the realm of routine experimentation 
and predictability in the art" (12/17/02 Office Action, at page 6). 

The Examiner makes a similar assertion in the 06/30/03 Office Action that "[t]he 
specification does not teach the specific amino acids that can be altered and yet still retain 
enzyme activity." (6/30/03 Office Action, at page 3). 

The first paragraph of 35 U.S.C. §112 requires that the Specification describe how to 
make and use the claimed subject matter. That requirement has been met in the present 
application. In particular, the Specification describes how to make and use naturally-occurring 
polypeptide variants of SEQ ID NO:l and polynucleotides encoding such variants. 

Independent claim 3 recites not only that the "variant" polynucleotides encode
polypeptides that are at least 95% identical to SEQ ID NO:1, but also that those polypeptides
have "a naturally-occurring amino acid sequence." Through the process of natural selection, nature will have
determined the appropriate amino acid sequences. Given the information provided by SEQ ID
NO:1 (the amino acid sequence of MT-MDH) and SEQ ID NO:2 (the polynucleotide sequence
encoding MT-MDH), one of skill in the art would be able to routinely obtain a polynucleotide
encoding a polypeptide comprising "a naturally-occurring amino acid sequence at least 95%
identical to the amino acid sequence of SEQ ID NO:1." The same holds for the "variant"
polynucleotides defined by independent claim 12: "a polynucleotide comprising a naturally
occurring polynucleotide sequence at least 95% identical to the polynucleotide sequence of SEQ
ID NO:2." For example, the identification of relevant polynucleotides could be performed by
hybridization and/or PCR techniques that were well-known to those skilled in the art at the time
the subject application was filed and/or described throughout the Specification of the instant
application. For example:

The terms "stringent conditions" or "stringency", as used herein, refer to
the conditions for hybridization as defined by the nucleic acid, salt, and 
temperature. These conditions are well known in the art and may be 
altered in order to identify or detect identical or related polynucleotide 
sequences. Numerous equivalent conditions comprising either low or high 
stringency depend on factors such as the length and nature of the sequence 
(DNA, RNA, base composition), nature of the target (DNA, RNA, base 
composition), milieu (in solution or immobilized on a solid substrate), 
concentration of salts and other components (e.g., formamide, dextran 
sulfate and/or polyethylene glycol), and temperature of the reactions 
(within a range from about 5°C below the melting temperature of the probe 
to about 20°C to 25°C below the melting temperature). One or more 
factors may be varied to generate conditions of either low or high
stringency different from, but equivalent to, the above listed conditions. 
(Specification at page 13, lines 11-21) 

In one aspect, hybridization with PCR probes which are capable of
detecting polynucleotide sequences, including genomic sequences, 
encoding MT-MDH or closely related molecules, may be used to identify 
nucleic acid sequences which encode MT-MDH. The specificity of the 
probe, whether it is made from a highly specific region, e.g., 10 unique 
nucleotides in the 5' regulatory region, or a less specific region, e.g., 
especially in the 3' coding region, and the stringency of the hybridization 
or amplification (maximal, high, intermediate, or low) will determine 
whether the probe identifies only naturally occurring sequences encoding 
MT-MDH, alleles, or related sequences. (Specification at page 37, line 25 
to page 38, line 3) 

Probes may also be used for the detection of related sequences, and should 
preferably contain at least 50% of the nucleotides from any of the MT-MDH
encoding sequences. The hybridization probes of the subject
invention may be DNA or RNA and derived from the nucleotide sequence 
of SEQ ID NO:2 or from genomic sequence including promoter, enhancer 
elements, and introns of the naturally occurring MT-MDH. (Specification 
at page 38, lines 4-8) 

See also Example VI, at page 50. 

Thus, one skilled in the art need not make and test vast numbers of polypeptides that are
based on the amino acid sequence of SEQ ID NO:1. Instead, one skilled in the art need only
screen a cDNA library or use appropriate PCR conditions to identify relevant
polynucleotides/polypeptides that already exist in nature. By adjusting the nature of the probe or
nucleic acid (i.e., non-conserved, conserved or highly conserved) and the conditions of
hybridization (maximum, high, intermediate or low stringency), one can obtain variant
polynucleotides of SEQ ID NO:2 which, in turn, will allow one to make the variant polypeptides
of SEQ ID NO:1 recited by the present claims. Furthermore, the Specification sets forth an assay
for measuring malate dehydrogenase activity (Example X at page 52, lines 11-18).

Accordingly, the document cited by the Examiner in the 12/17/02 Office Action relating to
structure-function relationships in proteins is simply not germane to whether one can make and
use the polypeptide variants recited by the present claims [i.e., Attwood et al. (Comput. Chem.,
25(4):329-339, 2001)]. Likewise, the cited document relating to alleged difficulties in assigning
protein function based on homology comparison is not relevant to making the claimed
polynucleotide variants [i.e., Ponting (Brief. Bioinform., 2(1):19-29, 2001)]. That is, regardless
of the precise functional characteristics of the SEQ ID NO:1 and SEQ ID NO:2 variants, one can
still make the claimed polynucleotide variants using the disclosure provided by the present
Specification. The polynucleotides could then be used in, for example, diagnostic testing, drug
discovery, expression profiling, etc., as discussed in the Bedilion Declaration, filed April 16,
2003.

Furthermore, the Board's attention is also directed to the enclosed reference by Brenner et 
al. ("Assessing sequence comparison methods with reliable structurally identified distant 
evolutionary relationships," Proc. Natl. Acad. Sci. USA (1998) 95:6073-6078). Through 
exhaustive analysis of a data set of proteins with known structural and functional relationships 
and with <90% overall sequence identity, Brenner et al. have determined that 30% identity is a 
reliable threshold for establishing evolutionary homology between two sequences aligned over at 
least 150 residues. (Brenner et al., pages 6073 and 6076.) Furthermore, local identity is 
particularly important in this case for assessing the significance of the alignments, as Brenner et 
al. further report that ≥40% identity over at least 70 residues is reliable in signifying homology
between proteins. (Brenner et al., page 6076.) 

Claim 3 recites, inter alia, a polynucleotide encoding a polypeptide comprising "a
naturally occurring amino acid sequence at least 95% identical to the amino acid sequence of
SEQ ID NO:1." In accordance with Brenner et al., naturally occurring molecules may exist which
could be characterized as MT-MDH proteins and which have as little as 30% identity over at
least 150 residues to SEQ ID NO:1. The "95% variants" recited by the present claims have a
variation that is far less than that of all potential MT-MDH proteins related to SEQ ID NO:1, i.e.,
those MT-MDH proteins having as little as 30% identity over at least 150 residues to SEQ ID
NO:1. Therefore, one would expect the SEQ ID NO:1 variants recited by the present claims to
have the functional activities of a MT-MDH protein.

As set forth in In re Marzocchi, 169 USPQ 367, 369 (CCPA 1971): 

The first paragraph of § 112 requires nothing more than objective enablement. 
[emphasis added] How such a teaching is set forth, either by the use of illustrative 
examples or by broad terminology, is of no importance. 

As a matter of Patent Office practice, then, a specification disclosure which 
contains a teaching of the manner and process of making and using the invention 
in terms which correspond in scope to those used in describing and defining the 
subject matter sought to be patented must be taken as in compliance with the
enabling requirement of the first paragraph of § 112 unless there is reason to doubt 
the objective truth of the statements contained therein which must be relied on for 
enabling support. 

Contrary to the standard set forth in Marzocchi, the Examiner has failed to provide any
reasons why one would doubt that the guidance provided by the present Specification would
enable one to make and use the recited variants of SEQ ID NO:1 or SEQ ID NO:2. Hence, a
prima facie case for non-enablement has not been established with respect to the variants of SEQ
ID NO:1 or SEQ ID NO:2.

For at least the above reasons, reversal of this rejection is requested. 

(9) CONCLUSION

For at least the reasons set forth above, reversal of the rejections under 35 U.S.C. § 112, 
first paragraph, based on lack of an adequate written description and on lack of enablement, is 
respectfully requested. 

If the USPTO determines that any additional fees are due, the Commissioner is hereby 
authorized to charge Deposit Account No. 09-0108. 

This brief is enclosed in triplicate. 

Respectfully submitted, 
INCYTE CORPORATION 

Date: 

APPENDIX - CLAIMS ON APPEAL

3. An isolated polynucleotide encoding a polypeptide selected from the group consisting 
of: 

a) a polypeptide comprising the amino acid sequence of SEQ ID NO:1; and

b) a polypeptide comprising a naturally occurring amino acid sequence at least 95%
identical to the amino acid sequence of SEQ ID NO:1.

6. A recombinant polynucleotide comprising a promoter sequence operably linked to a 
polynucleotide of claim 3. 

7. A cell transformed with a recombinant polynucleotide of claim 6. 

9. A method for producing a polypeptide encoded by a polynucleotide of claim 3, the 
method comprising: 

a) culturing a cell under conditions suitable for expression of the polypeptide, wherein 
said cell is transformed with a recombinant polynucleotide, and said recombinant polynucleotide 
comprises a promoter sequence operably linked to a polynucleotide of claim 3, and 

b) recovering the polypeptide so expressed. 

12. An isolated polynucleotide selected from the group consisting of: 

a) a polynucleotide comprising the polynucleotide sequence of SEQ ID NO:2, 

b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 
95% identical to the polynucleotide sequence of SEQ ID NO:2, 

c) a polynucleotide having a sequence complementary to a polynucleotide of a), 

d) a polynucleotide having a sequence complementary to a polynucleotide of b) and 

e) an RNA equivalent of a)-d). 

[illegible] & Chothia, C. (1995) J. Mol. Biol. 247, 536-540]. [illegible] Methods Enzymol. 266, 460-480], FASTA [Pearson, [illegible]] [illegible] and their scoring schemes. The error rate 
of all algorithms is greatly reduced by using statistical scores [illegible]. [illegible] statistical scores of SSEARCH and FASTA are 
[illegible] found in our tests agrees well with the scores reported. However, the P-values reported 
by [illegible] one-half of the relationships between proteins with 20-[illegible]% 
identity are found. Because many homologs have low sequence 
similarity, most distant relationships cannot be detected by 
any pairwise comparison method; [illegible] 
identified may be used with confidence. 

Sequence database searching plays a role in virtually every 
branch of molecular biology and is crucial for interpreting 
[illegible] forth from genome [illegible]. Given the 
[illegible] role, it is surprising that overall and relative 
capabilities of different procedures are largely unknown. It is 
difficult to verify algorithms on sample data because this 
requires large data sets of proteins whose evolutionary 
relationships are known unambiguously and independently of the 
methods being evaluated. However, nearly all known homologs 
have been identified by sequence analysis (the method 
to be tested). Also, it is generally very difficult to know, in the 
absence of structural data, whether two proteins that lack clear 
sequence similarity are unrelated. This has meant that although 
previous evaluations have helped improve sequence 
comparison, they have suffered from insufficient, imperfectly 
characterized, or artificial test data. Assessment also has been 
problematic because high quality database sequence searching 
[illegible] sensitivity (detection of homologs) and 
specificity (rejection of unrelated proteins); however, these 
[illegible] 

Sequence comparison methodologies have evolved rapidly, 
so no previously published tests has evaluated [illegible] 
gaps, to our knowledge. [illegible] 
threshold [illegible] published [illegible] 
percentage identity [illegible] 
measures have never actually been evaluated on large databases 
of real proteins. Moreover, [illegible] 
sequence comparison work? That is what [illegible] 

[illegible] sequence comparison methodologies [illegible] 
the three most commonly used programs. Of these, the Smith-Waterman 
algorithm (8) implemented in SSEARCH (3) [illegible] 
programs [illegible] parameters [illegible]. 

Abbreviation: EPQ, errors per query. 

*To whom reprint requests should be addressed. e-mail: brenner@hyper.stanford.edu. 

[illegible] superfamilies. Pearson found that modern matrices and "ln-scaling" 
of raw scores improve results considerably. He also 
[illegible] rigorous Smith-Waterman algorithm worked [illegible]. 

Very large scale analyses of matrices have been performed 
([illegible]), and Henikoff and Henikoff ([illegible]) also evaluated the 
[illegible] and FASTA. [illegible] with BLAST 
considered the ability to detect homologs above a predetermined 
[illegible] reported large numbers of spurious matches. The Henikoffs 
searched the SWISS-PROT database (12) and used [illegible] 
to define homologous families. Their [illegible] 
BLOSUM62 matrix (14) performed markedly better than [illegible]. 

A crucial aspect of any assessment is the data that are used 
to test the ability of the program to find homologs. But in 
Pearson's and the Henikoffs' evaluations of sequence comparison, 
the correct results were effectively unknown. This is 
[illegible] by using the same sequence comparison methods 
which are being evaluated. Interdependency of data and 
methods creates a "chicken and egg" problem, and means, for 
example, that new methods [illegible] 
identifying homologs missed by older [illegible]. 
[illegible] immunoglobulin variable and constant domains are [illegible] 
homologous, but PIR places them in different superfamilies. 
The problem is widespread: each superfamily in PIR 48.00 with 
a structural homolog is itself homologous to an average of 1.6 
other PIR superfamilies (16). 

To surmount these sorts of difficulties, Sander and Schneider 
(17) used protein structures to evaluate sequence comparison. 
Rather than comparing different sequence comparison 
algorithms, their work focused on determining a length-dependent 
threshold of percentage identity, above which all 
proteins would be of similar structure. A result of this analysis 
[illegible] states [illegible] identity 
over 80 residues will have similar structures, whereas shorter 
alignments require higher identity. (Other studies also have 
used structures (18-20), but these focused on a small number 
[illegible] and were principally oriented toward evaluating 
alignment accuracy rather than homology detection.) 

A general solution to the problem of scoring comes from 
statistical measures (i.e., E-values and P-values) based on the 
extreme value distribution (21). Extreme value scoring was 
implemented analytically in the BLAST program using the 
[illegible] statistics (22, 23), and [illegible] approaches 
have been recently added to FASTA and SSEARCH. In 
addition to being heralded as a reliable means of [illegible], 
[illegible] of statistical scores "is a crucial feature of the BLAST 
algorithm" (1). The validity of this scoring procedure has 
[illegible]. However, all large empirical tests used random 
sequences that may lack the subtle structure found in 
biological sequences (26, 27) and obviously do not contain any 
real homologs. Thus, although many researchers have [illegible], 
there have been no large rigorous experiments on biological 
[illegible] to determine the degree to which such rankings are 
superior. 

A Database for Testing Homology Detection. Since the 
[illegible] the structures of hemoglobin and myoglobin are 
very similar though their sequences are not (29), it has been 
apparent that comparing structures is a more powerful (if less 
convenient) way to recognize distant evolutionary relationships 
than comparing sequences. If two proteins show a high 
degree of similarity in their structural details and function, [illegible] 
are known. The SCOP database [illegible] 

[illegible] databases. One ([illegible]) [illegible] 
(PDB90D-B) has domains, which were all [illegible]. 
The databases [illegible] 
protein domain in SCOP [illegible] list. The 
highest quality domain was selected [illegible] 
database and removed [illegible] 
(and discarded) were all [illegible] threshold 
level of identity to the selected domain. This process [illegible] 
repeated until the list was [illegible]. [illegible] database 
contains 1,323 domains, which have [illegible] 
distant relationships, or [illegible] 
pairs. In PDB90D-B, the 2,079 domains [illegible]. 

Analyses from both databases were [illegible]. 
PDB40D-B focuses on distantly related proteins and reduces the 
[illegible] in the PDB of a [illegible] of 
families [illegible]. [illegible] results here are from PDB40D-B. 

[illegible] of sequence comparison algorithms [illegible]. 

[illegible] BLAST (1), version 1.4.9MP, and WU-BLAST2 
(2), version 2.0a13MP. Also assessed was the FASTA 
[illegible] 3.0t76 (3), which provided FASTA and the 
SSEARCH [illegible] of Smith-Waterman (8). 
[illegible] SSEARCH and FASTA we used BLOSUM45 with gap penalties 
[illegible]. [illegible] default parameters and matrix (BLOSUM62) 
were used for BLAST and WU-BLAST2. 

[illegible] pairs of query and target sequences with 
scores, from best to worst. The ideal method would [illegible] 

Fig. 1. Coverage vs. error plots of different [illegible]. 
[illegible] identification of 904 relationships. [illegible] plotted on a log scale to show [illegible] 
widely varying degrees of accuracy which may be desired. The [illegible] 
demonstrates the trade-off between [illegible] the levels of EPQ and coverage [illegible]. 

[illegible] perfect separation, with all of the homologs at the top of the [illegible] 
and unrelated proteins below. In [illegible] separation 
is impossible to achieve, so instead [illegible]. 

Our procedure involved measuring the coverage and [illegible] 
features of Receiver Operating Characteristic (ROC) [illegible] (33, 34) but 
better represent the high degree of accuracy required in 
sequence comparison and the huge background of nonhomologs. 
[illegible] information necessary [illegible] database 
search. The EPQ measure [illegible] score consistency: 
that is, it requires scores [illegible]. 

[Figure panel label: PDB90D-B] 

[Figure caption, number illegible] [illegible] high degree [illegible] that 
proteins are not related. [illegible] the raw [illegible] 
score of 85 nor the E-value [illegible] significant. Proteins rendered by 
RasMol (40). 

[Figure axis label: Alignment length] 

[Figure caption, number illegible] [illegible] of unrelated 
proteins in [illegible]. [illegible] proteins found with 
SSEARCH is plotted as a point whose position [illegible] length and 
the percentage identity [illegible]. Because alignment 
length and percentage identity are [illegible], many pairs of proteins 
may have exactly the same alignment length and percentage identity. 
The line shows the HSSP threshold [illegible] intended to be 
[illegible] with a different matrix and parameters [illegible]. 

Fig. 4. Reliability of statistical scores in PDB[illegible]D-B: Each line shows 
the relationship between reported score and actual error 
rate for a different program. E-values are reported for SSEARCH and 
FASTA, whereas P-values are shown for BLAST and WU-BLAST2. If the 
scoring were perfect, then the number of errors per query and the 
E-values would be the same, as indicated by the upper bold line. 
(P-values should be the same as EPQ for small numbers and diverge 
at higher values, as indicated by the lower bold line.) E-values from 
SSEARCH and FASTA are shown to have good agreement with EPQ but 
underestimate the significance slightly. BLAST and WU-BLAST2 are 
overconfident, with the degree of exaggeration dependent upon the 
score. The results for PDB[illegible]D-B were similar to those for PDB[illegible]D-B 
despite the difference in number of homologs detected. This graph 
could be used to roughly calibrate the reliability of a given statistical [illegible] 

[illegible] ignored in previous tests but is essential for the straightforward 
or automatic interpretation of sequence comparison results 
[illegible] provide a clear indication of the confidence that 
should be ascribed to each match. Indeed, the EPQ measure 
should approximate the expectation value reported by database 
searching programs, if the programs' estimates are accurate. 

The Performance of Scoring Schemes. All of the programs 
tested could provide three fundamental types of scores. The 
first score is the percentage identity, which may be computed 
[illegible] the lengths of the sequences. The second is a "raw" or 
[illegible] score, which is the measure optimized by 
the Smith-Waterman algorithm and is computed by summing 
the substitution matrix scores for each position in the alignment 
and subtracting gap penalties. In BLAST, a measure [illegible] 

[illegible] publications have indicated that [illegible] identity [illegible]. 
[illegible] principal reasons percentage identity does so poorly seem to be 
[illegible] of residue substitutions. 

From the PDB[illegible]D-B analysis in Fig. 3 we learn that [illegible] 
reliable [illegible] database for 
[illegible] alignments of at least 150 residues. Because one 
[illegible] pair of proteins has [illegible] over 62 residues, 
[illegible] for alignments to be at least 70 residues 
[illegible] number of distant [illegible] detected. 
[illegible] use of the HSSP equation improves the value of percentage 
identity, but even this measure can find only [illegible] of the [illegible]. 

Raw Scores. [illegible] measured in a sequence comparison. 
[illegible] Smith-Waterman raw scores perform better 
than percentage identity (Fig. 1), but ln-scaling (7) provided no 
notable benefit in our analysis. It is necessary to be [illegible] 
when using either raw or bit scores because [illegible] 
cutoff score could yield a tenfold difference in EPQ. However, 
[illegible] to choose appropriate thresholds because the 
[illegible] matched and the size of the database. Raw score thresholds 
also are affected by matrix and gap parameters. 

Statistical Scores. Statistical scores were introduced partly 
to overcome the problems that arise from raw scores. This 
scoring scheme provides the best discrimination [illegible] 
homologous proteins and those which are unrelated. Most 
likely, its power can be attributed to its incorporation of [illegible] 

[Figure caption, number illegible] Five different sequence comparison methods are evaluated [illegible] 
1% EPQ on this database, although at higher levels of error [illegible]. 

[illegible] between statistical scores and actual [illegible] 
query (Fig. 4). The [illegible] 
slightly conservative estimate of the chance of [illegible] sequences 
being found at random in a given [illegible]. [illegible] 
E-value of 0.01 indicates that roughly one [illegible] 
of this similarity should be found [illegible] queries. 
Neither raw scores nor percentage identity [illegible] 
this way, and these results [illegible]. 

[illegible] orders of magnitude for 1% EPQ for this [illegible]. Nonetheless, 
these results strongly suggest that [illegible] 
fundamentally appropriate [illegible] more reliable 
than those from BLAST but also [illegible] 
confidence by more than [illegible]. 

[illegible] ktup = 1 is nearly as effective as SSEARCH [illegible] 
slower than FASTA ktup = [illegible] is slightly faster than 
FASTA ktup = [illegible] interpretable scores. 
In PDB90D-B, where [illegible], the 
best method [illegible] only [illegible] known 
homologs (Fig. [illegible]). [illegible] method which [illegible] 
relationships is WU-BLAST2. Consequently, [illegible] the 
differences between FASTA ktup = 1 [illegible] 
programs are unlikely to be significant [illegible]. 

[illegible] with E-values can recognize >90% of the 
homologous pairs with 30-40% identity. In this [illegible] 
residues. Of sequences [illegible] are 
identified by SSEARCH E-values. However, although the number 
of homologs grows at lower levels of identity, the detection 
falls off sharply: [illegible] 
[illegible] 40% of homologs with 20-25% identity 
are detected and only 10% of those with 15-20% [illegible] can be found. 
These results show that statistical [illegible] find related 
proteins whose identity [illegible]. However, the [illegible] 
of the method is restricted by the great divergence of many 
protein sequences. 

After completion of this work, a [illegible] of pairwise 
BLAST was released: [illegible] supports gapped alignments, 
like WU-BLAST2, [illegible] with sum statistics. Our 
initial tests on [illegible] parameters show that its 
E-values are reliable and that its overall detection of homologs 
was substantially [illegible] but [illegible] 
quite equal [illegible]. 

Table 1. Summary of sequence comparison methods with PDB[illegible]D-B 
Columns: Method; Relative Time (numeric values illegible) 

ssearch % identity: within alignment 
ssearch % identity: within both 
ssearch % identity: HSSP-scaled 
ssearch Smith-Waterman raw scores 
ssearch E-values 
fasta ktup = 1 E-values 
fasta ktup = 2 E-values 
wu-blast2 P-values 
blast P-values 

Fig. 6. Distribution and detection of homologs [illegible]. 
[illegible] show the distribution of homologs [illegible] 
identity [illegible]. Filled regions [illegible] 
the number of these pairs found by the [illegible] searching method 
(SSEARCH with E-values) at 1% EPQ. [illegible] database contains 
proteins with <40% identity [illegible] 
structurally identified homologs [illegible] diverged extremely 
far in sequence [illegible] identity. Note that the 
alignments may be inaccurate [illegible] levels of identity. Filled 
regions show that SSEARCH can identify [illegible] relationships [illegible] have 
[illegible] or more identity, but [illegible] detection [illegible] below. 
Consequently, the great sequence [illegible] of most structurally 
identified [illegible] relationships [illegible] the ability of 
pairwise sequence comparison [illegible]. 

CONCLUSION 

[illegible] most effective sequence searches are made [illegible] 
in which the [illegible] complexity masked 
and [illegible] using statistical [illegible] 
BLAST [illegible] underestimate the true 
[illegible] are from large database searches with genome proteins [illegible] 
extent of errors. Second, SSEARCH, WU-BLAST2, and FASTA 
ktup = 1 perform best, though BLAST and FASTA ktup = 2 
detect most of the relationships found by the best procedures 
and are appropriate for rapid initial searches. 

[illegible] that are found by sequence comparison 
can be distinguished with high reliability from the huge 
[illegible]. However, even the best database 
[illegible] procedures tested [illegible] find the large majority of 
distant evolutionary relationships at an acceptable error rate. 
Thus, if the procedures assessed here fail to find a reliable 
match, it does not imply that the sequence is unique; rather, it 
indicates that any relatives it might have are distant ones. 

Additional and updated information about this [illegible] 
supplementary figures, may be [illegible]. 

Response Under 37 C.F.R. 1.116 - Expedited Procedure 

Examining Group 1652 

I hereby certify that this correspondence is being deposited with the United States Postal Service as first class mail in an envelope addressed to: 
Mail Stop Appeal Brief-Patents, Commissioner for Patents, P.O. Box 1450, Alexandria, Virginia 22313-1450 on . 

By: Printed: 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 
BEFORE THE BOARD OF PATENT APPEALS AND INTERFERENCES 

In re Application of: Bandman et al. 

Title: HUMAN MITOCHONDRIAL MALATE DEHYDROGENASE 

Serial No.: 09/915,694 Filing Date: July 25, 2001 

Examiner: Fronda, C. Group Art Unit: 1652 

Mail Stop Appeal Brief-Patents 
Commissioner for Patents 
P.O. Box 1450 

Alexandria, Virginia 22313-1450 

BRIEF ON APPEAL 

Sir: 

Further to the Notice of Appeal filed on September 30, 2003, and received by the USPTO 
on October 6, 2003, herewith are three copies of Appellants' Brief on Appeal. Appellants hereby 
request a one month extension of time in order to file this Brief. Authorized fees include the 
statutory fee of $110 for a one-month extension of time, as well as the $330.00 fee for the filing 
of this Brief. 

This is an appeal from the decision of the Examiner finally rejecting claims 3, 6, 7, 9 and 
12 of the above-identified application. 

(1) REAL PARTY IN INTEREST 
The above-identified application is assigned of record to Incyte Pharmaceuticals, Inc. 
(now Incyte Corporation, formerly known as Incyte Genomics, Inc.) (Reel 9774, Frame 0975), 
which is the real party in interest herein. 

(2) RELATED APPEALS AND INTERFERENCES 
Appellants, their legal representative and the assignee are not aware of any related 
appeals or interferences which will directly affect or be directly affected by or have a bearing on 
the Board's decision in the instant appeal. 

(3) STATUS OF THE CLAIMS 
Claims rejected: Claims 3, 6, 7, 9 and 12 
Claims allowed: (none) 
Claims objected to: Claims 4, 5 and 10. 
Claims canceled: Claims 1, 2, 8, 11, 13, 17-27, 30-45. 
Claims withdrawn: Claims 14-16, 28 and 29 

Claims on Appeal: Claims 3, 6, 7, 9 and 12 (A copy of the claims on appeal, as 
amended, can be found in the attached Appendix). 

(4) STATUS OF AMENDMENTS AFTER FINAL 
There were no amendments made after final. 

(5) SUMMARY OF THE INVENTION 

Embodiments of the present invention are directed, inter alia, to polynucleotides 
encoding mitochondrial malate dehydrogenase (MT-MDH). In particular embodiments, the 
mitochondrial malate dehydrogenases are selected from among amino acid sequences comprising 
SEQ ID NO:1. These polypeptides have strong chemical and structural homology with known 
mitochondrial malate dehydrogenases. For example: 

MT-MDH is 338 amino acids in length and has two potential N-glycosylation 
sites at residues N-117 and N-145, seven potential casein kinase II 
phosphorylation sites at T-54, S-69, T-109, T-170, S-261, S-309, and S-310, four 
potential protein kinase C phosphorylation sites at residues T-213, T-227, S-326, 
and T-336, a mitochondrial malate dehydrogenase active site signature between 
residues V-169 and V-181, and a transit peptide sequence from residues M-1 to 
N-24. As shown in Figures 2A and 2B, MT-MDH has chemical and structural 
homology with murine mitochondrial malate dehydrogenase (GI 
56643; SEQ ID NO:3) and porcine mitochondrial malate 
dehydrogenase (GI 164541; SEQ ID NO:4). In particular, MT-MDH and murine 
mitochondrial malate dehydrogenase share 94% identity, share both 
potential N-glycosylation sites, six potential casein kinase II sites, three potential 
protein kinase C sites, the mitochondrial malate dehydrogenase active site 
signature, and the transit peptide sequence. As illustrated by Figures 3A and 3B, 
respectively, MT-MDH and murine mitochondrial malate 
dehydrogenase (SEQ ID NO:3) have similar isoelectric points (pI = 8.8). As 
illustrated by Figures 4A and 4B, MT-MDH contains potential NAD(H) and 
NADP(H) binding site motifs. Northern analysis shows the expression of this 
sequence in various libraries, at least 49% of which are immortalized or cancerous 
and at least 24% of which involve immune response. Of particular note is the 
expression of MT-MDH in fetal tissues; in cardiovascular, gut, nervous, and 
reproductive tissues; and in secretory and hematopoietic tissues. (Specification at 
page 14, line 29 to page 15, line 17). 

The polynucleotides of the present invention have a variety of utilities. For example, they 
can be used for the diagnosis, prevention, or treatment of vesicle trafficking, immunological and 
neoplastic disorders. (See the Specification, e.g., at page 14, lines 18-21.) 

(6) ISSUES 

1. Whether claims 3, 6, 7, 9 and 12 meet the written description requirement of 35 
U.S.C. § 112, first paragraph. 

2. Whether claims 3, 6, 7, 9 and 12 meet the enablement requirement of 35 U.S.C. 
§ 112, first paragraph. 

(7) GROUPING OF THE CLAIMS 

As to Issue 1 

All of the claims on appeal are grouped together. 
As to Issue 2 

All of the claims on appeal are grouped together. 

(8) APPELLANTS' ARGUMENTS 

Issue 1- Written description rejection under 35 U.S.C. § 112, first paragraph 

Claims 3, 6, 7, 9 and 12 were rejected under 35 U.S.C. § 112, first paragraph, as allegedly 
"containing subject matter which was not described in the specification in such a way as to 
reasonably convey to one skilled in the relevant art that the inventor(s), at the time the application was 
filed, had possession of the claimed invention." (06/20/03 Office Action, at page 4.) In making 
the rejection, the Examiner asserts that: 

There is no disclosure of any particular structure to function/activity relationship in the 
single disclosed species. The specification also fails to describe additional 
representative species of these polynucleotides by any identifying structural 
characteristics or properties for which no predictability of structure is apparent. Given 
this lack of additional representative species as encompassed by the claims, Applicants 
have failed to sufficiently disclose the invention in such full, clear, concise and exact 
terms that a skilled artisan would recognize Applicants were in possession of the claimed 
invention. (12/17/02 Office Action, at pages 4-5). 

This rejection is improper, as the claims define subject matter which is described in the 
Specification in such a way as to reasonably convey to one skilled in the art that the inventors 
had possession of the claimed subject matter at the time the application was filed. The 
requirements necessary to fulfill the written description requirement of 35 U.S.C. § 112, first 
paragraph, are well established by case law: 

... the applicant must also convey with reasonable clarity to those skilled 
in the art that, as of the filing date sought, he or she was in possession of 
the invention. The invention is, for purposes of the "written description" 
inquiry, whatever is now claimed. Vas-Cath, Inc. v. Mahurkar, 19 
U.S.P.Q.2d 1111, 1117 (Fed. Cir. 1991). 

The Board's attention is also drawn to the Patent and Trademark Office's own 
"Guidelines for Examination of Patent Applications Under the 35 U.S.C. § 112, para. 1", published January 5, 
2001, which provide that: 

An applicant may also show that an invention is complete by disclosure of 
sufficiently detailed, relevant identifying characteristics which provide evidence 
that applicant was in possession of the claimed invention, i.e., complete or 
partial structure, other physical and/or chemical properties, functional 
characteristics when coupled with a known or disclosed correlation between 
function and structure, or some combination of such characteristics. What is 
conventional or well known to one of ordinary skill in the art need not be 
disclosed in detail. If a skilled artisan would have understood the inventor to be 
in possession of the claimed invention at the time of filing, even if every nuance of 
the claims is not explicitly described in the specification, then the adequate 
description requirement is met. 

Thus, the written description standard is fulfilled by both what is specifically disclosed 
and what is conventional or well known to one skilled in the art. 

A. The specification provides an adequate written description of the claimed 
"variants" of SEQ ID NO:1 and SEQ ID NO:2. 

The subject matter encompassed by claims 3, 6, 7, 9 and 12 is either disclosed by the 
specification or is conventional or well known to one skilled in the art. 

First, note that the "variant" language of independent claim 3 recites a polynucleotide 
encoding "a polypeptide comprising a naturally occurring amino acid sequence at least 95% 
identical to the amino acid sequence of SEQ ID NO:1" and the "variant" language of independent 
claim 12 recites "a polynucleotide comprising a naturally occurring polynucleotide sequence at 
least 95% identical to the polynucleotide sequence of SEQ ID NO:2." 

The amino acid sequence of SEQ ID NO:1 and the polynucleotide sequence of SEQ ID 
NO:2 are explicitly disclosed in the specification. See, for example, the Sequence Listing. 
Variants of SEQ ID NO:1 and SEQ ID NO:2 are described in the Specification at, for example, 
page 4, lines 13-14; page 7, lines 4-7 and 12-18; and page 15, lines 18-27. 

One of ordinary skill in the art would recognize polynucleotide sequences which are 
variants having a polynucleotide sequence at least 95% identical to SEQ ID NO:2, or which encode 
polypeptide variants having an amino acid sequence at least 95% identical to SEQ ID NO:1. 
Given any naturally occurring polynucleotide sequence, it would be routine for one of skill in the 
art to recognize whether it was a variant of SEQ ID NO:2, or whether it encoded a variant of SEQ 
ID NO:1. Accordingly, the specification provides an adequate written description of the recited 
polynucleotide variants of SEQ ID NO:2 and polynucleotides encoding polypeptide variants of 
SEQ ID NO:1. 
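
By way of illustration only, the comparison described above can be sketched in a few lines of Python. The sketch assumes that the two sequences have already been aligned to equal length, with "-" marking gaps, and it assumes one common convention for percent identity (identical residues divided by aligned, ungapped positions); the Specification may define percent identity differently. The function names and the example sequences are hypothetical and are not SEQ ID NO:1 or SEQ ID NO:2.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length and use '-' for gap positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def is_variant(aligned_ref: str, aligned_candidate: str, threshold: float = 95.0) -> bool:
    """True if the candidate is at least `threshold` percent identical to the reference."""
    return percent_identity(aligned_ref, aligned_candidate) >= threshold


if __name__ == "__main__":
    reference = "ACDEFGHIKLMNPQRSTVWY"   # hypothetical 20-residue sequence
    candidate = "ACDEFGHIKLMNPQRSTVWF"   # differs at one position
    print(percent_identity(reference, candidate), is_variant(reference, candidate))
```

Under these assumptions, a candidate differing from a 20-residue reference at one position is 95% identical and meets the recited threshold, whereas one differing at two positions does not.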

1. The present claims specifically define the claimed genus through the 
recitation of chemical structure 

Court cases in which "DNA claims" have been at issue commonly emphasize that the 
recitation of structural features or chemical or physical properties is an important factor to 
consider in a written description analysis of such claims. For example, in Fiers v. Revel, 25 
U.S.P.Q.2d 1601, 1606 (Fed. Cir. 1993), the court stated that: 

If a conception of a DNA requires a precise definition, such as by 
structure, formula, chemical name or physical properties, as we have held, 
then a description also requires that degree of specificity. 

In a number of instances in which claims to DNA have been found invalid, the courts 
have noted that the claims attempted to define the claimed DNA in terms of functional 
characteristics without any reference to structural features. As set forth by the court in University 
of California v. Eli Lilly and Co., 43 U.S.P.Q.2d 1398, 1406 (Fed. Cir. 1997): 

In claims to genetic material, however, a generic statement such as 
"vertebrate insulin cDNA" or "mammalian insulin cDNA," without more, 
is not an adequate written description of the genus because it does not 
distinguish the claimed genus from others, except by function. 

Thus, the mere recitation of functional characteristics of a DNA, without the definition of 
structural features, has been a common basis by which courts have found invalid claims to DNA. 
For example, in Lilly, 43 U.S.P.Q.2d at 1407, the court found invalid for violation of the written 
description requirement the following claim of U.S. Patent No. 4,652,525: 

1. A recombinant plasmid replicable in procaryotic host containing within 
its nucleotide sequence a subsequence having the structure of the reverse 
transcript of an mRNA of a vertebrate, which mRNA encodes insulin. 

In Fiers, 25 U.S.P.Q.2d at 1603, the parties were in an interference involving the 
following count: 

A DNA which consists essentially of a DNA which codes for a human 
fibroblast interferon-beta polypeptide. 

Party Revel in the Fiers case argued that its foreign priority application contained an 
adequate written description of the DNA of the count because that application mentioned a 
potential method for isolating the DNA. The Revel priority application, however, did not have a 
description of any particular DNA structure corresponding to the DNA of the count. The court 
therefore found that the Revel priority application lacked an adequate written description of the 
subject matter of the count. 

Docket No.: PF-0379-1 DIV 

Thus, in Lilly and Fiers, nucleic acids were defined on the basis of functional 
characteristics and were found not to comply with the written description requirement of 35 
U.S.C. § 112; i.e., "an mRNA of a vertebrate, which mRNA encodes insulin" in Lilly, and "DNA 
which codes for a human fibroblast interferon-beta polypeptide" in Fiers. In contrast to the 
situation in Lilly and Fiers, the claims at issue in the present application define polynucleotides 
and polypeptides in terms of chemical structure, rather than functional characteristics. For 
example, the "variant language" of independent claims 3 and 12 recites chemical structure to 
define the claimed genus: 

3. An isolated polynucleotide encoding a polypeptide selected from the group 
consisting of: 

b) a polypeptide comprising a naturally occurring amino acid sequence at least 
95% identical to the amino acid sequence of SEQ ID NO:1. 

12. An isolated polynucleotide selected from the group consisting of: 

b) a polynucleotide comprising a naturally occurring polynucleotide sequence at 
least 95% identical to the polynucleotide sequence of SEQ ID NO:2, .... 

From the above it should be apparent that the claims of the subject application are 
fundamentally different from those found invalid in Lilly and Fiers. The subject matter of the 
present claims is defined in terms of the chemical structure of SEQ ID NO:1 and SEQ ID NO:2. 
In the present case, there is no reliance merely on a description of functional characteristics of the 
polynucleotides and polypeptides recited by the claims. In fact, there is no recitation of 
functional characteristics. Moreover, if such functional recitations were included, it would add to 
the structural characterization of the recited polynucleotides. The polynucleotides defined in the 
claims of the present application recite structural features, and cases such as Lilly and Fiers stress 
that the recitation of structure is an important factor to consider in a written description analysis 
of claims of this type. By failing to base its written description inquiry "on whatever is now 
claimed," the Office Action fails to provide an appropriate analysis of the present claims and how 
they differ from those found not to satisfy the written description requirement in Lilly and Fiers. 
2. The present claims do not define a genus which is "highly variant" 

Furthermore, the claims at issue do not describe a genus which could be characterized as 
"highly variant." Available evidence illustrates that the claimed genus is of narrow scope. 

In support of this assertion, the Board's attention is directed to the enclosed reference by 
Brenner et al. ("Assessing sequence comparison methods with reliable structurally identified 
distant evolutionary relationships," Proc. Natl. Acad. Sci. USA (1998) 95:6073-6078). Through 
exhaustive analysis of a data set of proteins with known structural and functional relationships 
and with <90% overall sequence identity, Brenner et al. have determined that 30% identity is a 
reliable threshold for establishing evolutionary homology between two sequences aligned over at 
least 150 residues. (Brenner et al., pages 6073 and 6076.) Furthermore, local identity is 
particularly important in this case for assessing the significance of the alignments, as Brenner et 
al. further report that ≥40% identity over at least 70 residues is reliable in signifying homology 
between proteins. (Brenner et al., page 6076.) 
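
The two thresholds summarized above can be expressed as a simple check. The following Python sketch is offered for illustration only and is not drawn from Brenner et al. or from the Specification. The function names are hypothetical, and the sketch assumes that percent identity is computed as identical residues divided by aligned columns, which is only one of several possible conventions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over all aligned columns; '-' marks a gap (assumed convention)."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    # A column counts as a match only when both residues are identical and not gaps.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_reported_threshold(identity_pct: float, aligned_length: int) -> bool:
    """True if an alignment satisfies either threshold as summarized in this brief."""
    long_alignment = aligned_length >= 150 and identity_pct >= 30.0
    local_alignment = aligned_length >= 70 and identity_pct >= 40.0
    return long_alignment or local_alignment
```

Under these assumptions, an alignment at 95% identity over the full 338 residues of SEQ ID NO:1 would satisfy both thresholds by a wide margin.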

The present application is directed, inter alia, to polynucleotides encoding mitochondrial 
malate dehydrogenases, including polynucleotides encoding mitochondrial malate 
dehydrogenases related to the amino acid sequence of SEQ ID NO:1. In accordance with 
Brenner et al., naturally occurring molecules may exist which could be characterized as 
mitochondrial malate dehydrogenases and which have as little as 30% identity over at least 150 
residues to SEQ ID NO:1. The "variant language" of the present claims recites, for example, 
polynucleotides encoding a polypeptide comprising "a naturally occurring amino acid sequence 
at least 95% identical to the amino acid sequence of SEQ ID NO:1" (note that SEQ ID NO:1 has 
338 amino acid residues). This variation is far less than that of polynucleotides encoding all 
potential mitochondrial malate dehydrogenases related to SEQ ID NO:1, i.e., those mitochondrial 
malate dehydrogenases having as little as 30% identity over at least 150 residues to SEQ ID 
NO:1. 
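
As a worked illustration of this difference in scope, the following Python sketch counts the residue differences that each identity threshold would permit. It assumes, purely for illustration, that identity is measured over an ungapped alignment spanning the stated length; the function name is hypothetical and the calculation is not taken from the Specification.

```python
def max_differences(length: int, min_identity_pct: int) -> int:
    """Largest number of differing residues that still meets the identity threshold."""
    # Integer ceiling of (min_identity_pct * length / 100): the minimum identical residues.
    min_identical = -(-min_identity_pct * length // 100)
    return length - min_identical


variant_claim = max_differences(338, 95)    # "at least 95% identical" over 338 residues
distant_homolog = max_differences(150, 30)  # 30% identity over 150 residues
print(variant_claim, distant_homolog)       # prints: 16 105
```

Under these assumptions, a 95% variant of the 338-residue sequence may differ at no more than 16 positions, whereas a sequence at the 30% threshold may differ at as many as 105 of 150 aligned positions.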

3. The state of the art at the time of the present invention is further advanced 
than at the time of the Lilly and Fiers applications 

In the Lilly case, claims of U.S. Patent No. 4,652,525 were found invalid for failing to 
comply with the written description requirement of 35 U.S.C. §112. The '525 patent claimed the 
benefit of priority of two applications, Application Serial No. 801,343 filed May 27, 1977, and 
Application Serial No. 805,023 filed June 9, 1977. In the Fiers case, party Revel claimed the 
benefit of priority of an Israeli application filed on November 21, 1979. Thus, the written 
description inquiry in those cases was based on the state of the art at essentially the "dark ages" 
of recombinant DNA technology. 

The present application has a priority date of September 3, 1997. Much has happened in 
the development of recombinant DNA technology in the 18 or more years between the filing 
of the applications involved in Lilly and Fiers and the filing of the present application. For example, the 
technique of polymerase chain reaction (PCR) was invented. Highly efficient cloning and DNA 
sequencing technology has been developed. Large databases of protein and nucleotide sequences 
have been compiled. Much of the raw material of the human and other genomes has been 
sequenced. With these remarkable advances one of skill in the art would recognize that, given 
the sequence information of SEQ ID NO:1 and SEQ ID NO:2, and the additional extensive detail 
provided by the subject application, the present inventors were in possession of the claimed 
polynucleotide variants at the time of filing of this application. 

4. Summary 

The 12/17/02 and 6/30/03 Office Actions fail to base the written description inquiry "on 
whatever is now claimed." Consequently, the Office Actions do not provide an appropriate 
analysis of the present claims and how they differ from those found not to satisfy the written 
description requirement in cases such as Lilly and Fiers. In particular, the claims of the subject 
application are fundamentally different from those found invalid in Lilly and Fiers. The subject 
matter of the present claims is defined in terms of the chemical structure of SEQ ID NO:1 and 
SEQ ID NO:2. The courts have stressed that structural features are important factors to consider 
in a written description analysis of claims to nucleic acids and proteins. In addition, the genus of 
polynucleotides defined by the present claims is adequately described, as evidenced by Brenner 
et al. Furthermore, there have been remarkable advances in the state of the art since the Lilly and 
Fiers cases, and these advances were given no consideration whatsoever in the position set forth 
by the Office Action. 

For at least the reasons set forth above, the specification provides an adequate written 
description of the claimed subject matter, and this rejection should be reversed. 
Issue 2 - Enablement rejection under 35 U.S.C. § 112, first paragraph 

Claims 3, 6, 7, 9 and 12 stand rejected under 35 U.S.C. § 112, first paragraph, allegedly 
for lacking an enabling disclosure with respect to variants of SEQ ID NO: 1. The Examiner has 
specifically stated that these claims contain "subject matter which was not described in the 
specification in such a way as to enable one skilled in the art to which it pertains, or with which it 
is most nearly connected, to make and/or use the invention." (06/30/03 Office Action, at pages 
2-3.) 

In making the rejection, the Examiner asserts that: 

"The specification does not provide guidance with respect to the specific 
structural/catalytic amino acids and the structural motifs essential for enzyme 
structure and activity function which cannot be altered. Thus searching for the 
specific nucleotides to change (deletion, insertion, substitution, or combinations 
thereof) in a polynucleotide is well outside the realm of routine experimentation 
and predictability in the art" (12/17/02 Office Action, at page 6). 

The Examiner makes a similar assertion in the 06/30/03 Office Action that "[t]he 
specification does not teach the specific amino acids that can be altered and yet still retain 
enzyme activity." (6/30/03 Office Action, at page 3). 

The first paragraph of 35 U.S.C. §112 requires that the Specification describe how to 
make and use the claimed subject matter. That requirement has been met in the present 
application. In particular, the Specification describes how to make and use naturally-occurring 
polypeptide variants of SEQ ID NO: 1 and polynucleotides encoding such variants. 

Independent claim 3 recites not only that the "variant" polynucleotides encode 
polypeptides that are at least 95% identical to SEQ ID NO:1, but also that those polypeptides 
have "a naturally-occurring amino acid sequence." Through the process of natural selection, 
nature will have determined the appropriate amino acid sequences. Given the information 
provided by SEQ ID NO:1 (the amino acid sequence of MT-MDH) and SEQ ID NO:2 (the 
polynucleotide sequence encoding MT-MDH), one of skill in the art would be able to routinely 
obtain a polynucleotide encoding a polypeptide comprising "a naturally-occurring amino acid 
sequence at least 95% identical to the amino acid sequence of SEQ ID NO:1." The same is true 
of the "variant" polynucleotides defined by independent claim 12: "a polynucleotide comprising 
a naturally occurring polynucleotide sequence at least 95% identical to the polynucleotide 
sequence of SEQ ID NO:2." The identification of relevant polynucleotides could be performed 
by hybridization and/or PCR techniques that were well-known to those skilled in the art at the 
time the subject application was filed and/or that are described throughout the Specification of 
the instant application. For example: 

The terms "stringent conditions" or "stringency", as used herein, refer to 
the conditions for hybridization as defined by the nucleic acid, salt, and 
temperature. These conditions are well known in the art and may be 
altered in order to identify or detect identical or related polynucleotide 
sequences. Numerous equivalent conditions comprising either low or high 
stringency depend on factors such as the length and nature of the sequence 
(DNA, RNA, base composition), nature of the target (DNA, RNA, base 
composition), milieu (in solution or immobilized on a solid substrate), 
concentration of salts and other components (e.g., formamide, dextran 
sulfate and/or polyethylene glycol), and temperature of the reactions 
(within a range from about 5°C below the melting temperature of the probe 
to about 20°C to 25°C below the melting temperature). One or more 
factors be may be varied to generate conditions of either low or high 
stringency different from, but equivalent to, the above listed conditions. 
(Specification at page 13, lines 11-21) 

In one aspect, hybridization with PCR probes which are capable of 
detecting polynucleotide sequences, including genomic sequences, 
encoding MT-MDH or closely related molecules, may be used to identify 
nucleic acid sequences which encode MT-MDH. The specificity of the 
probe, whether it is made from a highly specific region, e.g., 10 unique 
nucleotides in the 5' regulatory region, or a less specific region, e.g., 
especially in the 3' coding region, and the stringency of the hybridization 
or amplification (maximal, high, intermediate, or low) will determine 
whether the probe identifies only naturally occurring sequences encoding 
MT-MDH, alleles, or related sequences. (Specification at page 37, line 25 
to page 38, line 3) 

Probes may also be used for the detection of related sequences, and should 
preferably contain at least 50% of the nucleotides from any of the MT- 
MDH encoding sequences. The hybridization probes of the subject 
invention may be DNA or RNA and derived from the nucleotide sequence 
of SEQ ID NO:2 or from genomic sequence including promoter, enhancer 
elements, and introns of the naturally occurring MT-MDH. (Specification 
at page 38, lines 4-8) 

See also Example VI, at page 50. 
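
The temperature range quoted above, from about 5°C below the melting temperature of the probe to about 20°C to 25°C below it, can be illustrated with a short Python sketch. The sketch is not part of the Specification. The function names are hypothetical, and the melting temperature estimate uses the Wallace rule, a rough approximation generally applied only to short oligonucleotide probes, which is an assumption introduced here for illustration.

```python
def wallace_tm(probe: str) -> float:
    """Rough melting temperature in degrees C for a short oligonucleotide (Wallace rule)."""
    seq = probe.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    # Wallace rule: 2 degrees per A/T base plus 4 degrees per G/C base.
    return 2.0 * at + 4.0 * gc


def hybridization_window(tm_celsius: float) -> dict:
    """Temperature bounds relative to Tm, following the range quoted from the Specification."""
    return {
        # Warmest reaction temperature: about 5 degrees below Tm.
        "highest_temp": tm_celsius - 5.0,
        # Coolest reaction temperature: about 20 to 25 degrees below Tm.
        "lowest_temp_range": (tm_celsius - 25.0, tm_celsius - 20.0),
    }
```

For a hypothetical 20-nucleotide probe with ten A/T and ten G/C bases, the estimate gives a melting temperature of 60°C, so the quoted range would run from roughly 35°C to 40°C at the low end up to about 55°C.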
Thus, one skilled in the art need not make and test vast numbers of polypeptides that are 
based on the amino acid sequence of SEQ ID NO:1. Instead, one skilled in the art need only 
screen a cDNA library or use appropriate PCR conditions to identify relevant 
polynucleotides/polypeptides that already exist in nature. By adjusting the nature of the probe or 
nucleic acid (i.e., non-conserved, conserved or highly conserved) and the conditions of 
hybridization (maximum, high, intermediate or low stringency), one can obtain variant 
polynucleotides of SEQ ID NO:2 which, in turn, will allow one to make the variant polypeptides 
of SEQ ID NO:1 recited by the present claims. Furthermore, the Specification sets forth an assay 
for measuring malate dehydrogenase activity (Example X at page 52, lines 11-18). 

Accordingly, the document cited by the Examiner in the 12/17/02 Office Action relating to 
structure-function relationships in proteins [i.e., Attwood et al. (Comput. Chem., 
25(4):329-339, 2001)] is simply not germane to whether one can make and use the polypeptide 
variants recited by the present claims. Likewise, the cited document relating to alleged 
difficulties in assigning protein function based on homology comparison [i.e., Ponting (Brief. 
Bioinform., 2(1):19-29, 2001)] is not relevant to making the claimed polynucleotide variants. 
That is, regardless of the precise functional characteristics of the SEQ ID NO:1 and SEQ ID 
NO:2 variants, one can still make the claimed polynucleotide variants using the disclosure 
provided by the present Specification. The polynucleotides could then be used in, for example, 
diagnostic testing, drug discovery, expression profiling, etc., as discussed in the Bedilion 
Declaration, filed April 16, 2003. 

Furthermore, the Board's attention is also directed to the enclosed reference by Brenner et 
al. ("Assessing sequence comparison methods with reliable structurally identified distant 
evolutionary relationships," Proc. Natl. Acad. Sci. USA (1998) 95:6073-6078). Through 
exhaustive analysis of a data set of proteins with known structural and functional relationships 
and with <90% overall sequence identity, Brenner et al. have determined that 30% identity is a 
reliable threshold for establishing evolutionary homology between two sequences aligned over at 
least 150 residues. (Brenner et al., pages 6073 and 6076.) Furthermore, local identity is 
particularly important in this case for assessing the significance of the alignments, as Brenner et 
al. further report that ≥40% identity over at least 70 residues is reliable in signifying homology 
between proteins. (Brenner et al., page 6076.) 
Claim 3 recites, inter alia, a polynucleotide encoding a polypeptide comprising "a 
naturally occurring amino acid sequence at least 95% identical to the amino acid sequence of 
SEQ ID NO:1." In accordance with Brenner et al., naturally occurring molecules may exist which 
could be characterized as MT-MDH proteins and which have as little as 30% identity over at 
least 150 residues to SEQ ID NO:1. The "95% variants" recited by the present claims have a 
variation that is far less than that of all potential MT-MDH proteins related to SEQ ID NO:1, i.e., 
those MT-MDH proteins having as little as 30% identity over at least 150 residues to SEQ ID 
NO:1. Therefore, one would expect the SEQ ID NO:1 variants recited by the present claims to 
have the functional activities of a MT-MDH protein. 

As set forth in In re Marzocchi, 169 USPQ 367, 369 (CCPA 1971): 

The first paragraph of § 112 requires nothing more than objective enablement. 
[emphasis added] How such a teaching is set forth, either by the use of illustrative 
examples or by broad terminology, is of no importance. 

As a matter of Patent Office practice, then, a specification disclosure which 
contains a teaching of the manner and process of making and using the invention 
in terms which correspond in scope to those used in describing and defining the 
subject matter sought to be patented must be taken as in compliance with the 
enabling requirement of the first paragraph of § 112 unless there is reason to doubt 
the objective truth of the statements contained therein which must be relied on for 
enabling support. 

Contrary to the standard set forth in Marzocchi, the Examiner has failed to provide any 
reasons why one would doubt that the guidance provided by the present Specification would 
enable one to make and use the recited variants of SEQ ID NO:1 or SEQ ID NO:2. Hence, a 
prima facie case for non-enablement has not been established with respect to the variants of SEQ 
ID NO:1 or SEQ ID NO:2. 

For at least the above reasons, reversal of this rejection is requested. 




(9) CONCLUSION 

For at least the reasons set forth above, reversal of the rejections under 35 U.S.C. § 112, 
first paragraph, based on lack of an adequate written description and on lack of enablement, is 
respectfully requested. 

If the USPTO determines that any additional fees are due, the Commissioner is hereby 
authorized to charge Deposit Account No. 09-0108. 
APPENDIX - CLAIMS ON APPEAL 

3. An isolated polynucleotide encoding a polypeptide selected from the group consisting 
of: 

a) a polypeptide comprising the amino acid sequence of SEQ ID NO:1; and 

b) a polypeptide comprising a naturally occurring amino acid sequence at least 95% 
identical to the amino acid sequence of SEQ ID NO: 1. 

6. A recombinant polynucleotide comprising a promoter sequence operably linked to a 
polynucleotide of claim 3. 

7. A cell transformed with a recombinant polynucleotide of claim 6. 



9. A method for producing a polypeptide encoded by a polynucleotide of claim 3, the 
method comprising: 

a) culturing a cell under conditions suitable for expression of the polypeptide, wherein 
said cell is transformed with a recombinant polynucleotide, and said recombinant polynucleotide 
comprises a promoter sequence operably linked to a polynucleotide of claim 3, and 

b) recovering the polypeptide so expressed. 

12. An isolated polynucleotide selected from the group consisting of: 

a) a polynucleotide comprising the polynucleotide sequence of SEQ ID NO:2, 

b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 
95% identical to the polynucleotide sequence of SEQ ID NO:2, 

c) a polynucleotide having a sequence complementary to a polynucleotide of a), 

d) a polynucleotide having a sequence complementary to a polynucleotide of b) and 

e) an RNA equivalent of a)-d). 
Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships 

Steven E. Brenner, Cyrus Chothia, and Tim J. P. Hubbard 

Sequence comparison methodologies have evolved 
so no previously published tests has evaluT.^ ^ P 
of programs cornmonrv us« I T 0 e^™?'™ Ven,ons 
Bl^T (1) have changed, and wj£J? ^ P^hl h Trodu " 

threshoK co^ g ha lh^° m °o PUb, i. hed — — « * 

-.age idcn^TulS^r^^^ 
measures have never a «,.,iw, u , statistical scoring 

commonly in use have not been compared I 8 

panson work? That is what fraction o f "fluence com- 

,L,«f^ ,Ueaee com P aris °" methodologies 

^ffBSS— 33s 

debase, and the matched proteins w e ? e ma °ed a be I 
homologousorunrelated according to their membership oi Z 



ABSTRACT Pairwise sequence comparison methods have 
been assessed using proteins whose relationships are known 
reliably from their structures and functions, as described in 
the SCOP database [Murzin, A. G., Brenner, S. E., Hubbard, T. 
& Chothia, C. (1995) J. Mol. Biol. 247, 536-540]. The evaluation 
tested the programs BLAST [Altschul, S. F., Gish, W., 
Miller, W., Myers, E. W. & Lipman, D. J. (1990) J. Mol. Biol. 
215, 403-410], WU-BLAST2 [Altschul, S. F. & Gish, W. (1996) 
Methods Enzymol. 266, 460-480], FASTA [Pearson, W. R. & 
Lipman, D. J. (1988) Proc. Natl. Acad. Sci. USA 85, 2444-2448], 
and SSEARCH [Smith, T. F. & Waterman, M. S. (1981) J. Mol. 
Biol. 147, 195-197] and their scoring schemes. The error rate 
of all algorithms is greatly reduced by using statistical scores 
to evaluate matches rather than percentage identity or raw 
scores. The E-value statistical scores of SSEARCH and FASTA 
are reliable: the number of false positives found in our tests 
agrees well with the scores reported. However, the P-values 
reported by BLAST and WU-BLAST2 exaggerate significance by 
orders of magnitude. SSEARCH, FASTA ktup = 1, and WU-BLAST2 
perform best, and they are capable of detecting almost all 
relationships between proteins whose sequence identities are 
>30%. For more distantly related proteins, they do much less 
well; only one-half of the relationships between proteins with 
20-30% identity are found. Because many homologs have low 
sequence similarity, most distant relationships cannot be 
detected by any pairwise comparison method; however, those 
which are identified may be used with confidence. 

Sequence database searching plays a role in virtually every 
branch of molecular biology and is crucial for interpreting the 
sequences issuing forth from genome projects. Given the 
method's central role, it is surprising that overall and relative 
capabilities of different procedures are largely unknown. It is 
difficult to verify algorithms on sample data because this 
requires large data sets of proteins whose evolutionary 
relationships are known unambiguously and independently of the 
methods being evaluated. However, nearly all known homologs 
have been identified by sequence analysis (the method 
to be tested). Also, it is generally very difficult to know, in the 
absence of structural data, whether two proteins that lack clear 
sequence similarity are unrelated. This has meant that 
although previous evaluations have helped improve sequence 
comparison, they have suffered from insufficient, imperfectly 
characterized, or artificial test data. Assessment also has been 
problematic because high quality database sequence searching 
attempts to have both sensitivity (detection of homologs) and 
specificity (rejection of unrelated proteins); however, these 
complementary goals are linked such that increasing one 
causes the other to be reduced. 

The publication costs of this article were defrayed in part by page charge 
payment. This article must therefore be hereby marked "advertisement" in 
accordance with 18 U.S.C. §1734 solely to indicate this fact. 

Abbreviation: EPQ, errors per query. 

*To whom reprint requests should be addressed. e-mail: brenner@hyper.stanford.edu. 
superfamilies, Pearson found that modern matrices and "ln-scaling" 
of raw scores improve results considerably. He also 
reported that the rigorous Smith-Waterman algorithm worked 
slightly better than FASTA, which was in turn more effective 
than BLAST. 

Very large scale analyses of matrices have been performed 
(10), and Henikoff and Henikoff (11) also evaluated the 
effectiveness of BLAST and FASTA. Their test with BLAST 
considered the ability to detect homologs above a predetermined 
score but had no penalty for methods which also 
reported large numbers of spurious matches. The Henikoffs 
searched the SWISS-PROT database (12) and used PROSITE (13) 
to define homologous families. Their results showed that the 
BLOSUM62 matrix (14) performed markedly better than the 
PAM-series matrices (15), which had been widely used. 

A crucial aspect of any assessment is the data that are used 
to test the ability of the program to find homologs. But in 
Pearson's and the Henikoffs' evaluations of sequence comparison, 
the correct results were effectively unknown. This is 
because the superfamilies in PIR and PROSITE are principally 
created by using the same sequence comparison methods 
which are being evaluated. Interdependency of data and 
methods creates a "chicken and egg" problem, and means, for 
example, that new methods would be penalized for correctly 
identifying homologs missed by older programs. For instance, 
immunoglobulin variable and constant domains are 
homologous, but PIR places them in different superfamilies. 
The problem is widespread: each superfamily in PIR 48.00 with 
a structural homolog is itself homologous to an average of 6 
other PIR superfamilies (16). 

To surmount these sorts of difficulties, Sander and Schneider 
(17) used protein structures to evaluate sequence comparison. 
Rather than comparing different sequence comparison 
algorithms, their work focused on determining a length-dependent 
threshold of percentage identity, above which all 
proteins would be of similar structure. A result of this analysis 
was the HSSP equation; it states that proteins with 25% identity 
over 80 residues will have similar structures, whereas shorter 
alignments require higher identity. (Other studies also have 
used structures (18-20), but these focused on a small number 
of model proteins and were principally oriented toward 
evaluating alignment accuracy rather than homology detection.) 

A general solution to the problem of scoring comes from 
statistical measures (i.e., E-values and P-values) based on the 
extreme value distribution (21). Extreme value scoring was 
implemented analytically in the BLAST program using the 
Karlin and Altschul statistics (22, 23), and empirical 
approaches have been recently added to FASTA and SSEARCH. In 
addition to being heralded as a reliable means of recognizing 
significantly similar proteins (24, 25), the mathematical 
tractability of statistical scores "is a crucial feature of the BLAST 
algorithm" (1). The validity of this scoring procedure has been 
tested analytically and empirically (see ref. 2 and references in 
ref. 24). However, all large empirical tests used random 
sequences that may lack the subtle structure found within 
biological sequences. Thus, although many researchers have 
suggested that statistical scores be used to rank matches (24, 25), 
there have been no large rigorous experiments on biological 
data to determine the degree to which such rankings are 
superior. 

A Database for Testing Homology Detection. Since the 
discovery that the structures of hemoglobin and myoglobin are 
very similar though their sequences are not (29), it has been 
apparent that comparing structures is a more powerful (if less 
convenient) way to recognize distant evolutionary relationships 
than comparing sequences. If two proteins show a high 
degree of similarity in their structural details and function, it 
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limitations. vX^^S™?^™^^™ 

databases. One f PDBwrvm Lv^TJ ' (3 U 0) * nd two 
, , w,,c twBWD-B) Has domains, which were all <roat 

S2S. "iM&fS SSI'S? 12 38 

dalabaseandremovTdTromAeSt^'" mC HT n ? 0,8 
(and discarded) were al, K^'SSslTS^ 

level of identity to the selected h^.™ tv w«nold 

contains domains, which hav# o ciaa w 
distant relationships, or -o!Sof £ 2? i*Sffi SSLS 

of 1^™^ 53^^^i!^^^ ,, ^ 

Sln&JesT^ 

-y he found a, ht^p'™ ™> 
Analyses from both databases were cenTrtliv S w 

hr D ove f r e ° ndiS,am,yre,a '« 
Sies nT P 32T e wh "° n ,hC PM 0f 3 sma » »»mb« of 

The analyses tested BLAST (1), version 1.4.9MP, and 
WU-BLAST2 (2), version 2.0a13MP. Also assessed was the FASTA 
package, version 3.0.76 (3), which provided FASTA and the 
SSEARCH implementation of Smith-Waterman 
-12™ 7 i n e W iM BL ° SUM4i W " h ?a P P enalli « 
sum,, we7e ^dS^i.?"^'-- . 

(co^Sr™ 

from the database was used as a query to search the database. 
This yielded ordered pairs of query and target sequences with 
associated scores, which were sorted, on the basis of their 
scores, from best to worst. The ideal method would have 
[Figure: coverage vs. error plots of different scoring schemes for SSEARCH Smith-Waterman (statistical scores, raw scores, and three measures of percentage identity). The y axis, errors per query (EPQ), is presented on a log scale; the x axis is coverage. The graph demonstrates the trade-off between sensitivity and selectivity.] 


perfect separation, with all of the homologs at the top of the 
list and unrelated proteins below. In practice, perfect separation 
is impossible to achieve, so instead one is interested in 
drawing a threshold above which there are the largest number 
of related pairs of sequences consistent with an acceptable 
error rate. 

Our procedure involved measuring the coverage and error 
for every threshold. Coverage was defined as the fraction of 
structurally determined homologs that have scores above the 
selected threshold; this reflects the sensitivity of a method. 
Errors per query (EPQ), an indicator of selectivity, is the 
number of nonhomologous pairs above the threshold divided 
by the number of queries. Graphs of these data, called 
coverage vs. error plots, were devised to understand 




[Figure: two proteins whose structures suggest that they are not related. Appropriately, neither the raw alignment score of 85 nor the E-value is significant. Proteins rendered by RasMol (40).] 


Srs'ha^cttv^^ ™~ 
cever Operating Charac ter^tic rRnrf"' 3 ' temtnt of Re " 
better represent the ^Tg J^ j' OU (33 ' 3*) b «' 
sequence comparison and thH L "curacy required in 
mologs. P " d ,he hu ? e b «kground of nonho- 

search. The EPQ -Sfi 

'ency: tha, is. it require Zo L Vr, h P ' Um ° n ^ COM "- 
, U er,es. C^^^^^^^g 
[Figure: length and percentage identity of alignments of unrelated proteins. Each pair of unrelated proteins found with SSEARCH is plotted as a point whose position depends on the length and the percentage identity within the alignment; many pairs of proteins may have exactly the same alignment length and percentage identity. The line shows the HSSP threshold. x axis: alignment length.] 
[Figure caption] Reliability of statistical scores. Each line shows 
the relationship between reported statistical score and actual error 
rate for a different program. E-values are reported for SSEARCH and 
FASTA, whereas P-values are shown for BLAST and WU-BLAST2. If the 
scoring were perfect, then the number of errors per query and the 
E-values would be the same, as indicated by the upper bold line. 
(P-values should be the same as EPQ for small numbers and diverge 
at higher values, as indicated by the lower bold line.) E-values from 
SSEARCH and FASTA are shown to have good agreement with EPQ but 
underestimate the significance slightly. BLAST and WU-BLAST2 are 
overconfident, with the degree of exaggeration dependent upon the 
score. The results for PDB40D-B were similar to those for PDB90D-B 
despite the difference in number of homologs detected. This graph 
could be used to roughly calibrate the reliability of a given statistical 
score. 

ignored in previous tests but is essential for the straightforward 
or automatic interpretation of sequence comparison results. 
Further, it provides a clear indication of the confidence that 
should be ascribed to each match. Indeed, the EPQ measure 
should approximate the expectation value reported by database 
searching programs, if the programs' estimates are accurate. 
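
The coverage and errors per query (EPQ) measures defined in this reference can be sketched in Python as follows. This is a minimal illustration and not the authors' code. The function name and the input format (a list of score and homology-flag pairs) are hypothetical, and the sketch assumes a raw-style score for which higher values indicate better matches; for E-values, where lower is better, the comparison would need to be reversed.

```python
def coverage_and_epq(hits, total_homolog_pairs, num_queries, threshold):
    """Return (coverage, errors per query) at a given score threshold.

    hits: iterable of (score, is_homolog) pairs pooled over all queries.
    total_homolog_pairs: number of structurally determined homologous pairs.
    num_queries: number of query sequences searched.
    """
    above = [is_homolog for score, is_homolog in hits if score >= threshold]
    true_matches = sum(1 for flag in above if flag)
    false_matches = len(above) - true_matches
    # Coverage: fraction of known homologs scoring at or above the threshold.
    coverage = true_matches / total_homolog_pairs
    # EPQ: nonhomologous pairs at or above the threshold, per query.
    epq = false_matches / num_queries
    return coverage, epq
```

Sweeping the threshold over every observed score and plotting the resulting pairs would give a curve of the kind the reference calls a coverage vs. error plot.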

The Performance of Scoring Schemes. All of the programs 
tested could provide three fundamental types of scores. The 
first score is the percentage identity, which may be computed 
in several ways based on either the length of the alignment or 
the lengths of the sequences. The second is a "raw" or 
"Smith-Waterman" score, which is the measure optimized by 
the Smith-Waterman algorithm and is computed by summing 
the substitution matrix scores for each position in the 
alignment and subtracting gap penalties. In BLAST, a measure 
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Moreover, publ.cations have indicated tW25* iden™ cf£ 

high .eveis of identity Sd t g £n? 

Despite the h.gh identity, the raw and the statS S c„~ f«; 
such .ncorrect matches are typicallv no" ,32 TW n 
«pal reasons percentage idemiry does 
that ,t ignores information about gaps mdiSS !he «n,« 
vat.ve or radical nature of residue'sutethu.loVs " 
From the pdbmd-b analysis in Fie 3 we leam th« w 
•dentity is a reliable threshold fo r'i hi databaTe only to 

httSf ° f P,0,e, ? S h " idemi » ° v « 62 retiduS 
it is probably necessary for alignments to be at least 70 residues 
in length before 40% is a reasonable threshold for a database 
of this particular size and composition. 
At a given reliability, scores based on percentage identity 

SSffj^rsr oi ,he dis,a,,, 

statistical scoring. If one measures the percentaee idemitv 
the aligned regions without considerat.oKSenUen^T 
S s e e n of ?H eg,i?ible nUmber ° f distam homo oTa^e detect 
S HSSP e ? U8,,0n im P roves ,he "life of percental 
identity, but even this measure can find onlv of all kn«S 

Raw Scores. Smith-Waterman raw scores perform better
than percentage identity (Fig. 1), but ln-scaling (7) provided no
notable benefit in our analysis. It is necessary to be very precise
when using either raw or bit scores because a [illegible] change in
cutoff score could yield a tenfold difference in EPQ. However,
it is difficult to choose appropriate thresholds because the
[illegible] matched and the size of the database. Raw score thresholds
also are affected by matrix and gap parameters.
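
The raw score and the two percentage-identity conventions described above can be sketched as follows. This is an illustration under stated assumptions, not the scoring used by any of the programs tested: the +5/-1 substitution function stands in for a real substitution matrix, and the gap convention (the first position of a gap costs `gap_open`, each further position costs `gap_extend`) is one common choice among several. All names and the example alignment are hypothetical.

```python
from typing import Callable


def toy_substitution(x: str, y: str) -> int:
    """Hypothetical stand-in for a substitution matrix: +5 identical, -1 otherwise."""
    return 5 if x == y else -1


def raw_score(aln_a: str, aln_b: str,
              sub: Callable[[str, str], int] = toy_substitution,
              gap_open: int = 10, gap_extend: int = 1) -> int:
    """Sum substitution scores over aligned columns and subtract gap penalties."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned strings must have equal length")
    score = 0
    gap_side = None  # which sequence the current run of gaps is in, if any
    for x, y in zip(aln_a, aln_b):
        if x == "-" or y == "-":
            side = "a" if x == "-" else "b"
            # Assumed convention: opening penalty once, then extension.
            score -= gap_extend if side == gap_side else gap_open
            gap_side = side
        else:
            score += sub(x, y)
            gap_side = None
    return score


def percent_identity(aln_a: str, aln_b: str, over: str = "alignment") -> float:
    """Percentage identity relative to alignment length or to the shorter sequence."""
    identical = sum(1 for x, y in zip(aln_a, aln_b) if x == y and x != "-")
    if over == "alignment":
        denom = len(aln_a)
    elif over == "shorter":
        denom = min(len(aln_a.replace("-", "")), len(aln_b.replace("-", "")))
    else:
        raise ValueError("over must be 'alignment' or 'shorter'")
    return 100.0 * identical / denom


# Hypothetical gapped alignment of two short sequences.
aln_a = "HEAGAWGHE-E"
aln_b = "HDA--WGHEKE"

print(raw_score(aln_a, aln_b))                              # default gap costs
print(raw_score(aln_a, aln_b, gap_open=12, gap_extend=2))   # harsher gap costs
print(percent_identity(aln_a, aln_b, over="alignment"))
print(percent_identity(aln_a, aln_b, over="shorter"))
```

The same alignment yields a different raw score when the gap parameters change, and a different percentage identity depending on the denominator, which is why a fixed threshold on either quantity does not transfer between settings.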

Statistical Scores. Statistical scores were introduced partly
to overcome the problems that arise from raw scores.

This scoring scheme provides the best discrimination between
homologous proteins and those which are unrelated.

[Figure caption, partly illegible: Coverage. Sequence comparison methods
were each evaluated [illegible]. SSEARCH finds 18% of relationships
at 1% EPQ. FASTA ktup = 1 and WU-BLAST2 are almost as [illegible]
on this database, although at higher [illegible]]

between statistical scores and [illegible]

slightly conservative estimate of the [illegible] sequences
being found at random. Thus, an
E-value of 0.01 indicates that roughly one [illegible] queries.

Neither raw scores nor percentage identity can be interpreted
in this way, and these results [illegible]

orders of magnitude for 1% EPQ. Nonetheless,
these results strongly suggest that the [illegible] theory is
fundamentally appropriate [illegible]

[illegible] reliable than [illegible]
confidence by more than an order of magnitude [illegible] EPQ

Overall Detection of Homologs and Comparison of Algorithms.
The results in Fig. [illegible] show that pairwise
sequence comparison is capable of identifying [illegible]

[illegible] identifies 15%, was the worst performer, whereas [illegible].
FASTA ktup = 1 is nearly as effective as SSEARCH. FASTA ktup = 2 and
WU-BLAST2 are intermediate in their ability to detect
homologs. Comparison of different [illegible]
those capable of identifying more homologs are [illegible]

relationships is WU-BLAST2. Consequently, [illegible] the
differences between FASTA ktup = 1 [illegible]
programs are unlikely to be significant [illegible]

[illegible] most distant homologs [illegible] be
[illegible] sequence comparison: a great many such
relationships have no more sequence identity than would be expected
by chance. SSEARCH with E-values can recognize [illegible] of the
homologous pairs with 30-40% identity. [illegible]
are 30 pairs of [illegible]

E-values, but 26 of these involve sequences with [illegible]
residues. Of sequences having 25-30% identity, 75% are
identified by SSEARCH E-values. However, although [illegible]

Table [illegible]. Summary of sequence comparison methods with [illegible]

[Figure caption, partly illegible: [illegible] according to their
identity, using [illegible] measure of [illegible], and
the number of these pairs found by the [illegible] searching method
(SSEARCH with E-values) at 1% EPQ. The database contains
proteins with <40% identity, and [illegible]
structurally identified homologs [illegible] diverged
extremely far [illegible]. Note that the
alignments may be inaccurate [illegible] low levels of identity.
[illegible] show that SSEARCH can [illegible] relationships that have
25% or more identity [illegible]]

Consequently, the great sequence [illegible] of most [illegible]
identified evolutionary relationships [illegible]
pairwise sequence comparison to detect them.

[illegible] find related [illegible]
of the method is [illegible] divergence of many
protein sequences.

After completion of this work, a new version of pairwise
BLAST was released: BLASTGP (37). It supports gapped alignments,
like WU-BLAST2, and [illegible] with sum statistics. Our
initial tests on BLASTGP with default parameters show that its
E-values are reliable and that [illegible]
was substantially better than [illegible] but not
quite equal to that of WU-BLAST2.

CONCLUSION 

[illegible] consensus amongst experts (see refs. 7, 24, 25, [illegible])
[illegible] the most [illegible] sequence searches are made by
[illegible] current database
in which the protein sequences have been complexity masked
and by using statistical scores to interpret the results. Our
experiments fully support [illegible]

[illegible] and WU-BLAST2 underestimate the true
extent of errors. Second, SSEARCH, WU-BLAST2, and FASTA
ktup = 1 perform best, though BLAST and FASTA ktup = 2
detect most of the relationships found by the best procedures
and are appropriate for rapid initial searches.

The homologous proteins that are found by sequence
comparison can be distinguished with high reliability from the huge
number of unrelated pairs. However, even the best database
searching procedures tested fail to find the large majority of
distant evolutionary relationships at an acceptable error rate.
Thus, if the procedures assessed here fail to find a reliable
match, it does not imply that the sequence is unique; rather, it
indicates that any relatives it might have are distant ones.

Additional and updated information about this work, including
supplementary figures, may be found at http://[illegible]

[illegible] to A. G. Murzin, M. Levitt, S. R. Eddy,
and G. Mitchison for valuable discussion. S.E.B. was [illegible]
supported by a St. John's College (Cambridge, UK) [illegible]
Scholarship and by the American Friends of Cambridge University.